Zoomed Ranking: Selection of Classification Algorithms Based on Relevant Performance Information
Authors
Abstract
Given the wide variety of available classification algorithms and the volume of data today's organizations need to analyze, the selection of the right algorithm to use on a new problem is an important issue. In this paper we present a combination of techniques to address this problem. The first one, zooming, analyzes a given dataset and selects relevant (similar) datasets that were processed by the candidate algorithms in the past. This process is based on the concept of "distance", calculated on the basis of several dataset characteristics. The information about the performance of the candidate algorithms on the selected datasets is then processed by a second technique, a ranking method. Such a method uses performance information to generate advice in the form of a ranking, indicating which algorithms should be applied in which order. Here we propose the adjusted ratio of ratios ranking method. This method takes into account not only accuracy but also the time performance of the candidate algorithms. The generalization power of this ranking method is analyzed. For this purpose, an appropriate methodology is defined. The experimental results indicate that on average better results are obtained with zooming than without it.
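The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the distance measure or the ranking formula, so everything specific here is an assumption: the distance is taken to be an L1 distance over min-max-normalized dataset characteristics, the ranking score is one plausible accuracy ratio discounted by a log time ratio, and all names (`zoom`, `rank`, `time_weight`) and all numbers are hypothetical.

```python
import math

def normalize(rows):
    """Min-max scale each characteristic (column) to [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]

def zoom(new_features, past_features, k):
    """Select the k past datasets closest to the new one (assumed L1 distance)."""
    names = list(past_features)
    rows = normalize([new_features] + [past_features[n] for n in names])
    target, rest = rows[0], rows[1:]
    dist = {n: sum(abs(a - b) for a, b in zip(target, r))
            for n, r in zip(names, rest)}
    return sorted(names, key=dist.get)[:k]

def adjusted_ratio(acc_p, acc_q, time_p, time_q, time_weight):
    """Accuracy ratio discounted by relative run time (assumed form).
    Only meaningful while the denominator stays positive."""
    return (acc_p / acc_q) / (1.0 + time_weight * math.log10(time_p / time_q))

def rank(performance, datasets, time_weight=0.1):
    """Order algorithms by their mean adjusted ratio against all other
    algorithms over the given datasets.
    performance[dataset][algorithm] = (accuracy, seconds)."""
    algos = sorted(performance[datasets[0]])
    score = {}
    for p in algos:
        ratios = [adjusted_ratio(*performance[d][p][:1], performance[d][q][0],
                                 performance[d][p][1], performance[d][q][1],
                                 time_weight)
                  for q in algos if q != p for d in datasets]
        score[p] = sum(ratios) / len(ratios)
    return sorted(algos, key=score.get, reverse=True)

# Hypothetical meta-data: (number of examples, number of attributes).
past = {"A": [100, 5], "B": [1000, 10], "C": [120, 6]}
perf = {
    "A": {"tree": (0.90, 1.0), "knn": (0.80, 10.0)},
    "B": {"tree": (0.50, 1.0), "knn": (0.90, 1.0)},
    "C": {"tree": (0.85, 2.0), "knn": (0.80, 20.0)},
}

relevant = zoom([110, 5], past, k=2)
zoomed_ranking = rank(perf, relevant)
unzoomed_ranking = rank(perf, list(past))
```

In this toy data the dissimilar dataset B favors a different algorithm, so ranking over all past datasets and ranking over only the zoomed-in ones give different orders. That is the kind of effect the zooming step is meant to exploit, although the example says nothing about how often it occurs in practice.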
Similar resources
Negative Selection Based Data Classification with Flexible Boundaries
One of the most important artificial immune algorithms is negative selection algorithm, which is an anomaly detection and pattern recognition technique; however, recent research has shown the successful application of this algorithm in data classification. Most of the negative selection methods consider deterministic boundaries to distinguish between self and non-self-spaces. In this paper, two...
A New Hybrid Method for Improving the Performance of Myocardial Infarction Prediction
Abstract Introduction: Myocardial Infarction, also known as heart attack, normally occurs due to such causes as smoking, family history, diabetes, and so on. It is recognized as one of the leading causes of death in the world. Therefore, the present study aimed to evaluate the performance of classification models in order to predict Myocardial Infarction, using a feature selection method tha...
A Solution to View Management to Build a Data Warehouse
Several techniques exist to select and materialize a proper set of data in a suitable structure that manage the queries submitted to the online analytical processing systems. These techniques are called view management techniques, which consist of three research areas: 1) view selection to materialize, 2) query processing and rewriting using the materialized views, and 3) maintaining materializ...
Zoomed Ranking: Selection of Classification Algorithms Based on Relevant Performance Information
The need for methods which would assist the user in selecting classification algorithms for a new problem has frequently been recognized as an important issue. Previous meta-learning approaches to algorithm selection consist of suggesting one algorithm or a small group of algorithms that are expected to perform well on the given problem [1, 3]. We believe that a more informative and flexible soluti...
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...